## Loading required package: rjags
## Loading required package: coda
## Linked to JAGS 4.3.2
## Loaded modules: basemod,bugs
In this exercise we will apply the classic Kalman Filter (KF) algorithm to the Google Flu Trends data we previously used to explore the state-space model. Unlike the previous exercise that fit a single time-series, we’ll utilize the matrix version of the Kalman filter to look at the flu across New England.
In the multivariate version of the KF, the connection between state variables in the Analysis step is provided in two ways: (1) through interactions in the process model itself, \(MPM^T\), and (2) through the covariance in the process error, \(Q\). In this assignment we’ll assimilate all 4 combinations of with/without interactions in the process model versus with/without correlation in the process error to evaluate how each impacts the inferences made. Since the KF will always be following the data, where we’ll see the largest impact of these choices will be in the differences in the state uncertainties, especially in the periods of missing data.
To begin, let’s load and plot the flu data for New England.
## load the Google flu data & select states
gflu = read.csv("data/gflu_data.txt",skip=11)
time = as.Date(gflu$Date)
states = c("Massachusetts","Connecticut","Rhode.Island","New.Hampshire","Vermont","Maine")
nstates = length(states)
y = t(gflu[,states])
## plot time-series from states
plot(time,1:length(time),type='n',ylab="Flu Index",lwd=2,log='y',ylim=range(y,na.rm=TRUE))
for(i in 1:nstates){
lines(time,y[i,],col=i,lwd=2)
}
legend("topleft",legend=states,lwd=2,col=1:nstates)
Kalman does not estimate parameters, so we will used parameters that were previously estimated by fitting a state space model to the data. In a real-world situation you wouldn’t fit two models to the same data (double dipping!), but rather you could fit a state-space model to the previous data and then use an operational forecast moving forward. Alternatively, you might augment the state matrix in the KF to include both the model states and the model parameters. However, for the classic KF, this approach is limited to only being able to estimate parameters that can be written as linear models of the augmented state + variable matrix M. Therfore, you are limited to estimating variables in the process model, f(X), not the parameters in the Observation Error or Process Error matrices. For the Kalman Filter exercise today we will be using estimates of these variance parameters, not the states, to inform the KF. Keep in mind that the KF is now treating these as KNOWN and thus ignoring parameter uncertainty.
In our previous model we assumed a Random Walk which we just fit Massachussetts. For this version we’ll keep working with a Random Walk but we’ll need to add a spatial contagious process to the random-walk process model. In other words, we’re positing that part of the reason that we see such strong correlations across-states is that infected individuals are able to move across state boundaries and infect individuals in adjacent states. To run such a model we’ll need to define a matrix that defines the adjacency between states, where 1 = adjacent, 0 = not adjacent, and the states are in the order: Massachusetts, Connecticut, Rhode.Island, New.Hampshire, Vermont, Maine.
## define adjacency between states slected
adj = matrix(c(0,1,1,1,1,0, ### state-to-state spatial adjacency (self=0)
1,0,1,0,0,0,
1,1,0,0,0,0,
1,0,0,0,1,1,
1,0,0,1,0,0,
0,0,0,1,0,0),nstates,nstates,byrow=TRUE)
To be more specific, lets assume a simple flux process just based on adjacency, and ignore differences in how population size, border length, transporation corridors, etc. affect the movement of individuals among the New England states.
\(X_{i,t+1} = X_{i,t} + \alpha*\sum(adj_{i,j}*(X_{j,t}-X_{i,t}))+\epsilon_{i,t}\)
Thus, if state j has more cases than state i, this will tend to increase infection in state i. For your reference, below is the JAGS model fit to the log-transformed flu data
SpatialRandomWalk = "
model{
#### Data Model
for(t in 1:n){
for(i in 1:nstate){
y[i,t] ~ dnorm(x[i,t],tau_obs)
}
}
#### Process Model
for(t in 2:n){
for(i in 1:nstate){
mu[i,t] <- x[i,t-1] + ## current state
alpha * sum(adj[i,1:nstate]*x[1:nstate,t-1]) - ## plus flux in
alpha * sum(adj[1:nstate,i])*x[i,t-1] ## minus flux out
}
x[1:nstate,t] ~ dmnorm(mu[1:nstate,t],Omega_proc)
}
#### Priors
for(i in 1:nstate){
x[i,1] ~ dnorm(x_ic,tau_ic)
}
tau_obs ~ dgamma(a_obs,r_obs)
Omega_proc ~ dwish(R,k)
alpha ~ dbeta(1,20)
}
"
And the parameters estimated from the model
## load parameters (assume known)
load("data/KFalpha.params.Rdata")
## observation error
tau_obs
## tau_obs
## 0.005094734
## process error covariance
knitr::kable(tau_proc,col.names = states)
| Massachusetts | Connecticut | Rhode.Island | New.Hampshire | Vermont | Maine |
|---|---|---|---|---|---|
| 0.0130656 | 0.0083375 | 0.0064081 | 0.0071263 | 0.0046522 | 0.0059065 |
| 0.0083375 | 0.0136074 | 0.0065342 | 0.0073813 | 0.0047711 | 0.0060086 |
| 0.0064081 | 0.0065342 | 0.0124185 | 0.0081431 | 0.0054649 | 0.0068645 |
| 0.0071263 | 0.0073813 | 0.0081431 | 0.0154499 | 0.0060621 | 0.0076979 |
| 0.0046522 | 0.0047711 | 0.0054649 | 0.0060621 | 0.0107488 | 0.0052666 |
| 0.0059065 | 0.0060086 | 0.0068645 | 0.0076979 | 0.0052666 | 0.0118565 |
## process error correlation
knitr::kable(cov2cor(tau_proc),col.names = states)
| Massachusetts | Connecticut | Rhode.Island | New.Hampshire | Vermont | Maine |
|---|---|---|---|---|---|
| 1.0000000 | 0.6252902 | 0.5030751 | 0.5015754 | 0.3925707 | 0.4745606 |
| 0.6252902 | 1.0000000 | 0.5026571 | 0.5090742 | 0.3945061 | 0.4730532 |
| 0.5030751 | 0.5026571 | 1.0000000 | 0.5878879 | 0.4730061 | 0.5657167 |
| 0.5015754 | 0.5090742 | 0.5878879 | 1.0000000 | 0.4704107 | 0.5687673 |
| 0.3925707 | 0.3945061 | 0.4730061 | 0.4704107 | 1.0000000 | 0.4665257 |
| 0.4745606 | 0.4730532 | 0.5657167 | 0.5687673 | 0.4665257 | 1.0000000 |
## process error SD
sqrt(diag(tau_proc))
## [1] 0.1143047 0.1166508 0.1114383 0.1242976 0.1036765 0.1088874
Now that we have estimates for our parameters, let’s write functions that evaluates the classic Kalman Filter. Note, if you were running the KF in ‘operational’ mode, where new data is arriving in real time, you wouldn’t run the function all at once but rather just call the KalmanAnalysis every time new data is observed, followed by KalmanForecast to make a new forecast.
##' Kalman Filter
##' @param M = model matrix
##' @param mu0 = initial condition mean vector
##' @param P0 = initial condition covariance matrix
##' @param Q = process error covariance matrix
##' @param R = observation error covariance matrix
##' @param Y = observation matrix (with missing values as NAs), time as col's
##'
##' @return list
##' mu.f, mu.a = state mean vector for (a)nalysis and (f)orecast steps
##' P.f, P.a = state covariance matrix for a and f
KalmanFilter <- function(M,mu0,P0,Q,R,Y){
## storage
nstates = nrow(Y)
nt = ncol(Y)
mu.f = matrix(NA,nstates,nt+1) ## forecast mean for time t
mu.a = matrix(NA,nstates,nt) ## analysis mean for time t
P.f = array(NA,c(nstates,nstates,nt+1)) ## forecast variance for time t
P.a = array(NA,c(nstates,nstates,nt)) ## analysis variance for time t
## initialization
mu.f[,1] = mu0
P.f[,,1] = P0
I = diag(1,nstates)
## run updates sequentially for each observation.
for(t in 1:nt){
## Analysis step: combine previous forecast with observed data
KA <- KalmanAnalysis(mu.f[,t],P.f[,,t],Y[,t],R,H=I,I)
mu.a[,t] <- KA$mu.a
P.a[,,t] <- KA$P.a
## Forecast step: predict to next step from current
KF <- KalmanForecast(mu.a[,t],P.a[,,t],M,Q)
mu.f[,t+1] <- KF$mu.f
P.f[,,t+1] <- KF$P.f
}
return(list(mu.f=mu.f,mu.a=mu.a,P.f=P.f,P.a=P.a))
}
##' Kalman Filter: Analysis step
##' @param mu.f = Forecast mean (vector)
##' @param P.f = Forecast covariance (matrix)
##' @param Y = observations, with missing values as NAs) (vector)
##' @param R = observation error covariance (matrix)
##' @param H = observation matrix (maps observations to states)
KalmanAnalysis <- function(mu.f,P.f,Y,R,H,I){
obs = !is.na(Y) ## which Y's were observed?
if(any(obs)){
H <- H[obs,] ## observation matrix
K <- P.f %*% t(H) %*% solve(H%*%P.f%*%t(H) + R[obs,obs]) ## Kalman gain
mu.a <- mu.f + K%*%(Y[obs] - H %*% mu.f) ## update mean
P.a <- (I - K %*% H)%*%P.f ## update covariance
## Note: Here's an alternative form that doesn't use the Kalman gain
## it is less efficient due to the larger number of matrix inversions (i.e. solve)
## P.a <- solve(t(H)%*%solve(R[obs,obs])%*%(H) + solve(P.f))
## mu.a <- P.a %*% (t(H)%*%solve(R[obs,obs])%*%Y[obs] + solve(P.f)%*%mu.f)
} else {
##if there's no data, the posterior is the prior
mu.a = mu.f
P.a = P.f
}
return(list(mu.a=mu.a,P.a=P.a))
}
##' Kalman Filter: Forecast Step
##' @param mu.a = analysis posterior mean (vector)
##' @param P.a = analysis posterior covariance (matrix)
##' @param M = model (matrix)
##' @param Q = process error covariance (matrix)
KalmanForecast <- function(mu.a,P.a,M,Q){
mu.f = M%*%mu.a
P.f = Q + M%*%P.a%*%t(M)
return(list(mu.f=mu.f,P.f=P.f))
}
With the Kalman Filter function defined, we need to define the inputs to the function. Note below that I’m using the variable KF00 to store the outputs, where I’m using 00 to indicate that this run was done with the defaults for both the process model and process error covariance. In the assignment below you will rerun this analysis under a number of alternatives varying the process error and the magnitude of spatial flux in the process model.
## log transform data
Y = log10(y)
## options for process model
alpha = 0 ## assume no spatial flux
#alpha = 0.05 ## assume a large spatial flux
M = adj*alpha + diag(1-alpha*apply(adj,1,sum)) ## random walk with flux
## options for process error covariance
#Q = tau_proc ## full process error covariance matrix
Q = diag(diag(tau_proc)) ## diagonal process error matrix
## observation error covariance (assumed independent)
R = diag(tau_obs,nstates)
## prior on first step, initialize with long-term mean and covariance
mu0 = apply(Y,1,mean,na.rm=TRUE)
P0 = cov(t(Y),use="pairwise.complete.obs")
#w <- P0*0+0.25 + diag(0.75,dim(P0)) ## iptional: downweight covariances in IC
#P0 = P0*w
## Run Kalman Filter
KF00 = KalmanFilter(M,mu0,P0,Q,R,Y)
After running the Kalman Filter, we can visualize the outputs. The first set of figures below shows the posterior analysis for each state through time. The second set shows the forecast and analysis standard deviations change through time, indicating when there is missing data in green on the bottom of the plot. As you can see the missing data is not synchronous across states, but the mean of the Analysis is influenced by the across-state covariances.
attach(KF00)
nt = length(time)
### plot ANALYSIS mean & CI time-series
par(mfrow=c(3,1))
for(i in 1:6){
ci = rbind(mu.a[i,]-1.96*sqrt(P.a[i,i,]),mu.a[i,]+1.96*sqrt(P.a[i,i,]))
plot(time,mu.a[i,],ylim=range(ci,na.rm=TRUE),type='n',main=states[i])
ecoforecastR::ciEnvelope(time,ci[1,],ci[2,],col="lightBlue")
lines(time,mu.a[i,],col=4)
lines(time,Y[i,])
}
## plot ANALYSIS and FORECAST variance time-series
par(mfrow=c(3,1))
for(i in 1:6){
plot(time,sqrt(P.a[i,i,]),ylim=c(0,sqrt(max(c(P.a[i,i,],P.f[i,i,])))),main=states[i],xlab="Time",
ylab="Std Error",type='l')
lines(time,sqrt(P.f[i,i,1:nt]),col=2)
points(time[is.na(Y[i,])],rep(0,nt)[is.na(Y[i,])],pch="*",col=3) ## flag's the zero's
legend("topright",legend=c("Analysis","Forecast","NAs"),col=1:3,lty=c(1,1,NA),pch=c(NA,NA,1),cex=1.4)
}
Finally, to get a better idea about the dynamics of how the Kalman Filter works we can zoom in to a subset of time for one state and show the Forecast, Analysis, and observed data altogether.
## subset time
time2 <- time[time>as.Date("2015-01-01")]
tsel <- which(time %in% time2)
n = length(time2)*2
## interleave Forecast and Analysis
mu = p = rep(NA,n)
mu[seq(1,n,by=2)] = mu.f[1,tsel]
mu[seq(2,n,by=2)] = mu.a[1,tsel]
p[seq(1,n,by=2)] = 1.96*sqrt(P.f[1,1,tsel])
p[seq(2,n,by=2)] = 1.96*sqrt(P.a[1,1,tsel])
ci = cbind(mu-p,mu+p)
time3 = sort(c(time2,time2+1))
## plot Forecast, Analysis, and data
plot(time3,mu,ylim=range(ci),type='l')
ecoforecastR::ciEnvelope(time3,ci[,1],ci[,2],col="lightBlue")
lines(time3,mu,lwd=2)
points(time,Y[1,])
Run the KF under all four combinations of covariance in the process model versus process error and compare the results. In particular you’ll want to pay attention to the missing data at the beginning of the timeseries for some states. You’ll also want to comment on how spatial adjacency affects the confidence in the inferences (some states are more isolated than others) in the four different scenarios. Finally, you’ll want to note that the alpha estimated from the data itself (0.000209), is close to zero and thus our real forecast would be much more like our no-flux run than our high flux run.
Task 1, KF00 [A]: Run the default analysis, KF00, with no spatial flux in the process model, and no spatial covariance in the process error
attach(KF00)
## The following objects are masked from KF00 (pos = 3):
##
## mu.a, mu.f, P.a, P.f
nt = length(time)
### plot ANALYSIS mean & CI time-series
par(mfrow=c(3,1))
for(i in 1:6){
ci = rbind(mu.a[i,]-1.96*sqrt(P.a[i,i,]),mu.a[i,]+1.96*sqrt(P.a[i,i,]))
plot(time,mu.a[i,],ylim=range(ci,na.rm=TRUE),type='n',main=states[i])
ecoforecastR::ciEnvelope(time,ci[1,],ci[2,],col="lightBlue")
lines(time,mu.a[i,],col=4)
lines(time,Y[i,])
}
## plot ANALYSIS and FORECAST variance time-series
par(mfrow=c(3,1))
for(i in 1:6){
plot(time,sqrt(P.a[i,i,]),ylim=c(0,sqrt(max(c(P.a[i,i,],P.f[i,i,])))),main=states[i],xlab="Time",
ylab="Std Error",type='l')
lines(time,sqrt(P.f[i,i,1:nt]),col=2)
points(time[is.na(Y[i,])],rep(0,nt)[is.na(Y[i,])],pch="*",col=3) ## flag's the zero's
legend("topright",legend=c("Analysis","Forecast","NAs"),col=1:3,lty=c(1,1,NA),pch=c(NA,NA,1),cex=1.4)
}
Task 2, KF01 [A]: Rerun with process error set to the full covariance
matrix of Q, compare the results with the original – what impact does
including covariance in the process error have on the inference?
## log transform data
Y = log10(y)
## options for process model
alpha = 0 ## assume no spatial flux
#alpha = 0.05 ## assume a large spatial flux
M = adj*alpha + diag(1-alpha*apply(adj,1,sum)) ## random walk with flux
## options for process error covariance
Q = tau_proc ## full process error covariance matrix
#Q = diag(diag(tau_proc)) ## diagonal process error matrix
## observation error covariance (assumed independent)
R = diag(tau_obs,nstates)
## prior on first step, initialize with long-term mean and covariance
mu0 = apply(Y,1,mean,na.rm=TRUE)
P0 = cov(t(Y),use="pairwise.complete.obs")
#w <- P0*0+0.25 + diag(0.75,dim(P0)) ## iptional: downweight covariances in IC
#P0 = P0*w
## Run Kalman Filter
KF01 = KalmanFilter(M,mu0,P0,Q,R,Y)
attach(KF01)
## The following objects are masked from KF00 (pos = 3):
##
## mu.a, mu.f, P.a, P.f
## The following objects are masked from KF00 (pos = 4):
##
## mu.a, mu.f, P.a, P.f
nt = length(time)
### plot ANALYSIS mean & CI time-series
par(mfrow=c(3,1))
for(i in 1:6){
ci = rbind(mu.a[i,]-1.96*sqrt(P.a[i,i,]),mu.a[i,]+1.96*sqrt(P.a[i,i,]))
plot(time,mu.a[i,],ylim=range(ci,na.rm=TRUE),type='n',main=states[i])
ecoforecastR::ciEnvelope(time,ci[1,],ci[2,],col="lightBlue")
lines(time,mu.a[i,],col=4)
lines(time,Y[i,])
}
## plot ANALYSIS and FORECAST variance time-series
par(mfrow=c(3,1))
for(i in 1:6){
plot(time,sqrt(P.a[i,i,]),ylim=c(0,sqrt(max(c(P.a[i,i,],P.f[i,i,])))),main=states[i],xlab="Time",
ylab="Std Error",type='l')
lines(time,sqrt(P.f[i,i,1:nt]),col=2)
points(time[is.na(Y[i,])],rep(0,nt)[is.na(Y[i,])],pch="*",col=3) ## flag's the zero's
legend("topright",legend=c("Analysis","Forecast","NAs"),col=1:3,lty=c(1,1,NA),pch=c(NA,NA,1),cex=1.4)
}
#The SE decreased and the CI narrowed, especially in the early period of forecast where NAs exist. This could suggest that the full covariance approach leverage the covariance information to better handle the missing data.
Task 3, KF10 [B]: Rerun with alpha = 0.05 but switch back to the diagonal Q matrix (no spatial covariance). Comparing KF10 to KF00, what impact does including a spatial flux in the process model have on the inference?
## log transform data
Y = log10(y)
## options for process model
#alpha = 0 ## assume no spatial flux
alpha = 0.05 ## assume a large spatial flux
M = adj*alpha + diag(1-alpha*apply(adj,1,sum)) ## random walk with flux
## options for process error covariance
#Q = tau_proc ## full process error covariance matrix
Q = diag(diag(tau_proc)) ## diagonal process error matrix
## observation error covariance (assumed independent)
R = diag(tau_obs,nstates)
## prior on first step, initialize with long-term mean and covariance
mu0 = apply(Y,1,mean,na.rm=TRUE)
P0 = cov(t(Y),use="pairwise.complete.obs")
#w <- P0*0+0.25 + diag(0.75,dim(P0)) ## iptional: downweight covariances in IC
#P0 = P0*w
## Run Kalman Filter
KF10 = KalmanFilter(M,mu0,P0,Q,R,Y)
attach(KF10)
## The following objects are masked from KF01:
##
## mu.a, mu.f, P.a, P.f
## The following objects are masked from KF00 (pos = 4):
##
## mu.a, mu.f, P.a, P.f
## The following objects are masked from KF00 (pos = 5):
##
## mu.a, mu.f, P.a, P.f
nt = length(time)
### plot ANALYSIS mean & CI time-series
par(mfrow=c(3,1))
for(i in 1:6){
ci = rbind(mu.a[i,]-1.96*sqrt(P.a[i,i,]),mu.a[i,]+1.96*sqrt(P.a[i,i,]))
plot(time,mu.a[i,],ylim=range(ci,na.rm=TRUE),type='n',main=states[i])
ecoforecastR::ciEnvelope(time,ci[1,],ci[2,],col="lightBlue")
lines(time,mu.a[i,],col=4)
lines(time,Y[i,])
}
## plot ANALYSIS and FORECAST variance time-series
par(mfrow=c(3,1))
for(i in 1:6){
plot(time,sqrt(P.a[i,i,]),ylim=c(0,sqrt(max(c(P.a[i,i,],P.f[i,i,])))),main=states[i],xlab="Time",
ylab="Std Error",type='l')
lines(time,sqrt(P.f[i,i,1:nt]),col=2)
points(time[is.na(Y[i,])],rep(0,nt)[is.na(Y[i,])],pch="*",col=3) ## flag's the zero's
legend("topright",legend=c("Analysis","Forecast","NAs"),col=1:3,lty=c(1,1,NA),pch=c(NA,NA,1),cex=1.4)
}
#The SE decreased and the CI narrowed, especially in the early period of forecast. The result suggests that the spatial flux approach is realistic in smoothing the estimates across neighboring states.
Task 4, KF11 [A]: Rerun with alpha = 0.05 and the full process error covariance Q matrix. Compare KF11 to the previous runs – what impact does including both a spatial process and a process error covariance have over their impacts individually.
## log transform data
Y = log10(y)
## options for process model
#alpha = 0 ## assume no spatial flux
alpha = 0.05 ## assume a large spatial flux
M = adj*alpha + diag(1-alpha*apply(adj,1,sum)) ## random walk with flux
## options for process error covariance
Q = tau_proc ## full process error covariance matrix
#Q = diag(diag(tau_proc)) ## diagonal process error matrix
## observation error covariance (assumed independent)
R = diag(tau_obs,nstates)
## prior on first step, initialize with long-term mean and covariance
mu0 = apply(Y,1,mean,na.rm=TRUE)
P0 = cov(t(Y),use="pairwise.complete.obs")
#w <- P0*0+0.25 + diag(0.75,dim(P0)) ## iptional: downweight covariances in IC
#P0 = P0*w
## Run Kalman Filter
KF11 = KalmanFilter(M,mu0,P0,Q,R,Y)
attach(KF11)
## The following objects are masked from KF10:
##
## mu.a, mu.f, P.a, P.f
## The following objects are masked from KF01:
##
## mu.a, mu.f, P.a, P.f
## The following objects are masked from KF00 (pos = 5):
##
## mu.a, mu.f, P.a, P.f
## The following objects are masked from KF00 (pos = 6):
##
## mu.a, mu.f, P.a, P.f
nt = length(time)
### plot ANALYSIS mean & CI time-series
par(mfrow=c(3,1))
for(i in 1:6){
ci = rbind(mu.a[i,]-1.96*sqrt(P.a[i,i,]),mu.a[i,]+1.96*sqrt(P.a[i,i,]))
plot(time,mu.a[i,],ylim=range(ci,na.rm=TRUE),type='n',main=states[i])
ecoforecastR::ciEnvelope(time,ci[1,],ci[2,],col="lightBlue")
lines(time,mu.a[i,],col=4)
lines(time,Y[i,])
}
## plot ANALYSIS and FORECAST variance time-series
par(mfrow=c(3,1))
for(i in 1:6){
plot(time,sqrt(P.a[i,i,]),ylim=c(0,sqrt(max(c(P.a[i,i,],P.f[i,i,])))),main=states[i],xlab="Time",
ylab="Std Error",type='l')
lines(time,sqrt(P.f[i,i,1:nt]),col=2)
points(time[is.na(Y[i,])],rep(0,nt)[is.na(Y[i,])],pch="*",col=3) ## flag's the zero's
legend("topright",legend=c("Analysis","Forecast","NAs"),col=1:3,lty=c(1,1,NA),pch=c(NA,NA,1),cex=1.4)
}
#The early stage forecast SE decrease further for both analysis and forecast. The CI is narrower and traces the analysis mean better than the previous runs.
Task 5 [A]: In a true forecasting situation you don’t have all the data in hand at once. You also often want to make a forecast that is farther than one time-step into the future. Write an R function you could run daily that would:
You should leverage the existing KalmanAnalysis and
KalmanForecast functions in your new function, and you
shouldn’t need to change anything in or about those functions.
Note: in a real world situation, where your forecast model has driver/covariate data, you would actually want to first re-run the forecast from yesterday to today with the actual (now observed) driver/covariate data, rather than using your archived forecast (which was done based on your forecasted covariate/driver data), before assimilating today’s new observations of your response data.
dailyupdate <- function(mu_prev, P_prev, new_data, M, Q, R, steps_ahead = 16) {
#update
if(!is.na(new_data)) {
update <- KalmanAnalysis(mu_prev, P_prev, new_data, R, diag(1, length(mu_prev), length(mu_prev)))
mu_now <- update$mu.a
P_now <- update$P.a
} else {
mu_now <- mu_prev
P_now <- P_prev
}
#storage
mu_forecasts <- matrix(nrow = length(mu_now), ncol = time_steps + 1)
P_forecasts <- array(dim = c(length(mu_now), length(mu_now), time_steps + 1))
mu_forecasts[, 1] <- mu_now
P_forecasts[, , 1] <- P_now
#forecast
for (t in 2:(time_steps + 1)) {
forecast <- KalmanForecast(mu_forecasts[, t-1], P_forecasts[, , t-1], M, Q)
mu_forecasts[, t] <- forecast$mu.f
P_forecasts[, , t] <- forecast$P.f
}
return(list(mu = mu_forecasts, P = P_forecasts))
}
Task 6 [C: Extra Credit]: Using the run with alpha=0.05 and full process error covariance Q, apply your forecast function to make, and visualize, 5 iterative forecasts. Hint: when visualizing, remember that each forecast starts from a different day.